Re-computing the levels of a factor

Problem

You want to re-compute the levels of a factor. This is useful when a factor contains levels that aren't actually present in the data, which can happen during data import or after you remove some rows.
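
As a rough illustration of how this situation arises, the sketch below subsets a factor and then checks for levels that no longer occur. The names f, f_sub, and unused are made up for this example.

```r
# Hypothetical example data: a factor whose three levels are all in use
f <- factor(c("alpha", "beta", "gamma", "alpha"))

# Remove the "gamma" element by subsetting
f_sub <- f[f != "gamma"]

# Subsetting keeps the full level set, so "gamma" is now an unused level
levels(f_sub)
# "alpha" "beta"  "gamma"

# List the levels that no longer occur in the data
unused <- setdiff(levels(f_sub), as.character(f_sub))
unused
# "gamma"
```

Running table(f_sub) is another quick check: unused levels show up with a count of 0.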

Solution

For a single factor object:

# Create a factor with an extra level (gamma)
x <- factor(c("alpha","beta","alpha"), levels=c("alpha","beta","gamma"))
x
# alpha beta  alpha
# Levels: alpha beta gamma

# Remove the extra level
x <- factor(x)
x
# alpha beta  alpha
# Levels: alpha beta
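
Calling factor() on an existing factor rebuilds the level set from the values that are present. The droplevels() function can be applied to a single factor as well, and in both cases the remaining levels should keep their original relative order. A minimal sketch, using a made-up factor named size:

```r
# Hypothetical example: levels given in a deliberate, non-alphabetical order
size <- factor(c("small", "large", "small"),
               levels = c("small", "medium", "large"))

# droplevels() also accepts a single factor
size2 <- droplevels(size)
levels(size2)
# "small" "large"

# The values themselves are unchanged; only the unused level is gone
identical(as.character(size2), as.character(size))
# TRUE
```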

After importing data, you may have a data frame with a mix of factors and other kinds of vectors, and want to re-compute the levels of all the factors at once.

# Create a data frame with some factors (with extra levels)
x <- factor(c("alpha","beta","alpha"), levels=c("alpha","beta","gamma"))
y <- c(5,8,2)
z <- factor(c("red","green","green"), levels=c("red","green","blue"))
df <- data.frame(x,y,z)

# Display the factors (with extra levels)
df$x
# alpha beta  alpha
# Levels: alpha beta gamma
df$z
# red   green green
# Levels: red green blue

# Drop the extra levels
df <- droplevels(df)

# Show the factors again, now without extra levels
df$x
# alpha beta  alpha
# Levels: alpha beta
df$z
# red   green green
# Levels: red green
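
If some columns should keep their full set of levels, the data frame method of droplevels() takes an except argument giving the indices of columns to leave alone. The sketch below uses a made-up data frame d and assumes the colour column should be left untouched.

```r
# Hypothetical data frame with two factor columns and one numeric column
group <- factor(c("a", "b", "c"))
colour <- factor(c("red", "green", "blue"))
d <- data.frame(group, value = c(1, 2, 3), colour)

# Removing a row leaves "c" and "blue" behind as unused levels
d <- d[d$value < 3, ]

# Drop unused levels everywhere except the colour column (column 3)
d2 <- droplevels(d, except = 3)
levels(d2$group)
# "a" "b"
levels(d2$colour)
# "blue"  "green" "red"
```

Keeping the full level set can be useful when tables or plots from different subsets of the data need to show the same categories.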